- The MAILING DATE of this communication appears on the cover sheet with the correspondence address -
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed 
after SIX (6) MONTHS from the mailing date of this communication. 
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b). 

Status 

1) ☒ Responsive to communication(s) filed on 28 April 2004.
2a) ☒ This action is FINAL. 2b) □ This action is non-final.

3) □ Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

4) ☒ Claim(s) 1-27 is/are pending in the application.

5) □ Claim(s) is/are allowed.

6) ☒ Claim(s) 1-27 is/are rejected.

7) □ Claim(s) is/are objected to.

8) □ Claim(s) are subject to restriction and/or election requirement.

Application Papers 

10) ☒ The drawing(s) filed on 02 May 2001 is/are: a) ☒ accepted or b) □ objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a). 
Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d). 
* See the attached detailed Office action for a list of the certified copies not received.



Attachment(s) 

1) □ Notice of References Cited (PTO-892)

2) □ Notice of Draftsperson's Patent Drawing Review (PTO-948)

3) □ Information Disclosure Statement(s) (PTO-1449 or PTO/SB/08)

Paper No(s)/Mail Date .

4) □ Interview Summary (PTO-413)

Paper No(s)/Mail Date .

5) □ Notice of Informal Patent Application (PTO-152)

6) □ Other: .



U.S. Patent and Trademark Office 
PTOL-326 (Rev. 1-04) 



Office Action Summary 



Part of Paper No./Mail Date 6 





Application/Control Number: 09/846,200 
Art Unit: 2655 



Page 2 



Detailed Action 



Response to Amendment 



1. In response to the office action from 4/5/04, the applicant has submitted an amendment,
filed 4/28/04, amending the specification and Claims without adding new matter, while arguing
to traverse the art rejection based on the limitation regarding "detecting a natural pause between
input subgroups" (Amendment, Page 10).

Applicant's arguments have been fully considered; however, the previous rejection is
maintained for the reasons listed below in the Response to Arguments.

2. Based on the amendments to the specification and claims, the examiner has withdrawn 
the previous objections directed towards minor informalities. 



Response to Arguments

3. Applicant's arguments have been fully considered but they are not persuasive for the
following reasons:

• With respect to Claims 1 and 13, the applicant argues that Power et al (U.S.
Patent: 5,848,388) does not teach "detecting a natural pause between input
subgroups" and further argues that "subgroups of speech units are not individual
words" (Amendment, Page 10). However, as is currently claimed in Claims 1 and
13, pauses are detected between "subgroup[s] of speech units that form part of a
complete speech sequence" (Amendment, Page 4, Claim 1, and Page 6, Claim
13). As cited by Power, the claimed "complete speech sequence" is interpreted as
being a phrase (Power, Col. 4, Lines 26-27), while the speech unit subgroups that
form the phrase correspond to words (Col. 4, Lines 64-66). Thus, since Power
teaches a pause detector that detects pauses following words, a word is a subgroup
of a collection of words that comprise a complete phrase input, and no detection
of an inter-word pause has been specifically claimed, Power sufficiently teaches
the limitation regarding "detecting a natural pause between input subgroups".
Also, since Power recites the pause detector as noted above and Zavoli et al (U.S.
Patent: 6,598,016) discloses the ability to detect and display individual speech
units (Zavoli, Col. 6, Lines 54-57), Zavoli in view of Power sufficiently teaches
the limitation of Claim 13 regarding "a detector for detecting a natural pause after
receiving the subgroup".

• In response to applicant's argument that the references fail to show certain
features of applicant's invention, it is noted that the features upon which applicant
relies (i.e., "subgroups of speech are not individual words," Amendment, Page
10, suggesting that a subgroup is a portion of a word and the pauses that the
present invention detects are inter-word pauses) are not recited in the rejected
claim(s). Although the claims are interpreted in light of the specification,
limitations from the specification are not read into the claims. See In re Van
Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

• Since Power teaches "detecting a natural pause between input subgroups," as
noted above, the rejection with regards to dependent Claims 2-8, 10-12, 14-23, 
and 25-27 is maintained. 

• With respect to Claims 9 and 24, the applicant argues that Larsen ("Investigating
a Mixed-Initiative Dialogue Management Strategy," 1997) does not disclose "the
ability to enter speech units using a dial pad upon repeated recognition errors";
however, the applicant provides no arguments as to why Larsen does not teach this limitation.
Therefore, since Power teaches "detecting a natural pause between input
subgroups," as noted above, the rejection with regards to dependent Claims 9 and
24 is maintained.

Therefore, the below rejection is maintained: 



Claim Rejections - 35 USC §103

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are
such that the subject matter as a whole would have been obvious at the time the invention was made to a person
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the
manner in which the invention was made.

5. Claims 1-8, 10-23, and 25-27 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Zavoli et al (U.S. Patent: 6,598,016) in view of Power et al (U.S. Patent: 5,848,388).



With respect to Claims 1 and 13, Zavoli discloses: 

A method and system for recognizing speech in systems that accept speech input,
comprising:

Receiving at least a current subgroup of speech units that form part of a complete speech
sequence that is to be input from a user (receiving part of a complete spoken digit string from a
microphone, Col. 6, Lines 52-57);

Recognizing the speech units of the subgroup to provide a recognition result (displaying
individual digits within a string to a user for correction/verification, Col. 6, Lines 54-57, upon
recognition from a voice recognition module, Col. 5, Lines 24-28); and

Immediately feeding back the recognition result for verification by the user (displaying
individual digits to a user, upon recognition, for correction/verification, Col. 6, Lines 54-57).

Zavoli does not teach the ability to detect pauses in speech; however, Power discloses:

Detecting a natural pause between input subgroups (pause detector used to detect a pause
following a word and further enable a parser to output a word recognition signal, Col. 4, Lines
64-66).

Zavoli and Power are analogous art because they are from a similar field of endeavor in 
speech-controlled interfaces capable of recognizing segments of a complete utterance. Thus, it 
would have been obvious to a person of ordinary skill in the art, at the time of invention, to 
combine the ability to detect pauses between words as taught by Power with the method and 
system utilizing recognition and display of utterance segments for correction as taught by Zavoli 
to increase speech recognition accuracy by clearly identifying pauses between the spoken digits 
to recognize individual speech segments of a complete utterance, prevent recognition error in 
confusing inter-word pauses with an end of speech, and further correct any words within a 
sequence that may have been spoken in error or recognized incorrectly (recognition error,
Power, Col 2, Lines 48-64). Therefore, it would have been obvious to combine Power with 
Zavoli for the benefit of obtaining a more accurate speech recognition system capable of 
recognizing segments of a complete utterance using inter-word pause detection means, to obtain 
the invention as specified in Claims 1 and 13. 
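
For illustration only, the combination discussed above (pause detection after a word, recognition of each subgroup, and immediate feedback with "no"/"yes" commands) can be sketched as follows. This sketch is not taken from Zavoli, Power, or the application. The function names, the input format of (unit, silence) pairs, and the 0.5 second pause threshold are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical input: (unit, silence_after_seconds) pairs standing in for
# recognized speech units and the silence measured after each one.
Unit = Tuple[str, float]

PAUSE_THRESHOLD_S = 0.5  # assumed value; neither reference specifies one


def split_on_pauses(units: List[Unit],
                    threshold: float = PAUSE_THRESHOLD_S) -> List[List[str]]:
    """Group speech units into subgroups, closing a subgroup at each pause."""
    subgroups: List[List[str]] = []
    current: List[str] = []
    for token, silence in units:
        current.append(token)
        if silence >= threshold:  # natural pause detected after this unit
            subgroups.append(current)
            current = []
    if current:  # trailing units that were not followed by a pause
        subgroups.append(current)
    return subgroups


def enter_sequence(subgroups: List[List[str]]) -> Tuple[List[str], List[str]]:
    """Feed back each subgroup; "no" deletes the previous one, "yes" ends entry."""
    accepted: List[List[str]] = []
    feedback: List[str] = []
    for group in subgroups:
        if group == ["no"]:
            if accepted:
                removed = accepted.pop()
                feedback.append("deleted " + " ".join(removed))
        elif group == ["yes"]:
            feedback.append("confirmed")
            break
        else:
            accepted.append(group)
            feedback.append("heard " + " ".join(group))
    digits = [token for group in accepted for token in group]
    return digits, feedback
```

In this sketch, entering a further subgroup without saying "no" leaves the previous subgroup in place, so it is treated as accepted.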

With respect to Claims 2 and 14, Zavoli further discloses: 

A speech recognition method and system, wherein said user is only prompted to repeat 
said subgroup for re-recognition and re-verification if a rejection criteria is met ("no" command 
used to delete a previous incorrect digit so that a new digit within a sequence may be reentered. 
A correct sequence is indicated upon the utterance of a "yes" command, Col. 6, Lines 58-63). 

Also, it would have been obvious to one of ordinary skill in the art, at the time of 
invention, to further indicate that a digit has been deleted by prompting a user to enter a new 
digit, thus making the user aware that the previous digit has been deleted and may be replaced. 

With respect to Claims 3 and 20, Zavoli further recites: 

A speech recognition method and system, further comprising: 

Repeating the steps of Claim 1 for remaining input subgroups until it is determined that 
the complete speech sequence has been recognized (recognition of digits within a string until a 
"yes" command is received, indicating sequence completion, Col. 6, Lines 60-65).

With respect to Claims 4 and 21, Zavoli teaches the system utilizing recognition and
display of utterance segments for correction, as applied to Claim 1. Zavoli does not teach the
ability to output a feedback utterance to a user through speech synthesis means or a pre-recorded
message; however, Power discloses:

A speech recognition method and system, wherein the last step of Claim 1 is effected
using pre-recorded prompts or via text-to-speech synthesis (TTS) to feed back the recognition
result (providing a synthesized prompt and response to a user for verification of a speech input,
Col. 9, Lines 57-61).

Zavoli and Power are analogous art because they are from a similar field of endeavor in 
speech-controlled interfaces capable of recognizing segments of a complete utterance. Thus, it 
would have been obvious to a person of ordinary skill in the art, at the time of invention, to 
combine the speech synthesis means for prompting a user to verify a speech input as taught by 
Power with the method and system enabling recognition and display of utterance segments for 
correction as taught by Zavoli in order to synthesize recognized speech segments, thus providing 
a convenient means of notifying a user of a recognition result if a user is occupied and does not 
have the means to view a text display such as in an automobile application. Also, it would have
been obvious to one of ordinary skill in the art, at the time of invention, to use a pre-recorded 
message in place of speech synthesis, since a pre-recorded message would be an obvious 
variation of synthesis as a means of prompting a recognition result to a user. Therefore, it would 
have been obvious to combine Power with Zavoli for the benefit of obtaining a convenient 
means of notifying a user of a recognition result, to obtain the invention as specified in Claims 4 
and 21. 

With respect to Claims 5 and 18, Zavoli adds: 

A speech recognition method and system, wherein said rejection criteria is embodied as a 
negative utterance spoken by the user after receiving the fed back recognition result ("no" 
command used to delete a previously displayed digit, Col 6, Lines 58-60). 

With respect to Claim 6, Zavoli further discloses:

A speech recognition method, wherein said rejection criteria is embodied as a negative 
utterance spoken by the user concurrent with inputting the subgroup that is recognized in the 
third step of Claim 1 ("no" command used to delete a previously displayed digit before a digit 
sequence is completely entered and verified with a "yes" command, Col 6, Lines 58-65). 

With respect to Claims 7 and 22, Zavoli in view of Power teaches the speech recognition 
method and system capable of recognizing individual word segments (digits) through pause 
detection means to enable, upon input of a negative utterance, correction of input and recognition 
errors, as applied to Claims 2 and 14. Neither Zavoli nor Power specifically suggests prompting a
user to input shorter speech segments upon repeated recognition errors; however, it would have
been obvious to one of ordinary skill in the art, at the time of invention, to prompt the user to 
input shorter speech segments because if repeated recognition errors occur, shorter utterances 
have less complex speech models and thus, logically, would provide a higher level of recognition 
accuracy. Therefore, prompting a user to input easily recognized, shorter speech segments 
would provide a well known means of increasing recognition accuracy. 

With respect to Claims 8 and 23, Zavoli in view of Power teaches the speech recognition 
method and system capable of recognizing individual word segments (digits) through pause 
detection means to enable, upon input of a negative utterance, correction of input and recognition 
errors, as applied to Claims 2 and 14. Neither Zavoli nor Power specifically suggests prompting a
user to input shorter speech segments upon repeated recognition errors as a means of training a
user; however, it would have been obvious to one of ordinary skill in the art, at the time of
invention, that by speaking shorter and more easily recognized speech segments, a user would 
gradually learn the proper way to input recognizable utterances. For instance, a user may speak a
string of digits too quickly to be recognized correctly. By speaking each speech segment
individually, the speaker would be able to attempt a single utterance segment multiple times and
gradually come to understand the proper method of producing recognizable speech. Therefore,
prompting a user to speak smaller speech segments acts as a means of training that user to properly
input an utterance.

With respect to Claims 10 and 25, Zavoli further discloses: 

A speech recognition method and system, wherein said speech units are selected from 
any of spoken digits, spoken letters and spoken words (spoken digit recognition, Col. 6, Lines 
54-57). 

Also, it would have been obvious to one of ordinary skill in the art, at the time of 
invention, to implement the speech recognition method taught by Zavoli in a word and letter 
recognition application, since all types are related to recognition of a speech segment within an 
utterance, to increase the usefulness of the recognizer. Furthermore, word and letter recognition 
are obvious variations of spoken digit recognition within a sequence of digits and thus, would be 
compatible with the same system, only requiring additional speech model sets. 

With respect to Claims 11 and 26, Zavoli further suggests:

A speech recognition method and system, wherein input of a next subgroup after
receiving the fed back recognition result indicates a correct recognition of the currently input
subgroup ("no" command used to delete a previous incorrect digit so that a new digit within a
sequence may be reentered. If a digit is correctly recognized the user will input another digit,
thus the previous recognition result is considered correct since no negative command was input.
The user may further verify the result once an entire sequence has been entered with a "yes"
command, Col 6, Lines 58-65).

With respect to Claims 12 and 27, Zavoli teaches the method and system utilizing 
recognition and display of utterance segments for correction initialized through a negative input 
command, as applied to Claims 2 and 14. Zavoli does not teach the rejection of a speech 
segment based on a confidence level, however Power discloses: 

A speech recognition method, wherein said rejection criteria requires determining a level 
of confidence in said recognition result (rejection of a recognized speech input based on 
confidence level, Col 9, Lines 54-57). 

Zavoli and Power are analogous art because they are from a similar field of endeavor in 
speech-controlled interfaces capable of recognizing segments of a complete utterance. Thus, it 
would have been obvious to a person of ordinary skill in the art, at the time of invention, to 
combine the method of recognition rejection based upon a confidence level as taught by Power 
with the method and system utilizing recognition and display of utterance segments for 
correction initialized through a negative input command as taught by Zavoli to provide a means 
of rejecting an invalid or mispronounced speech input that has a suspect identification, through 
the use of a confidence level score, to ensure that an entered digit sequence is properly 
recognized, especially in an application where digit sequence accuracy is critical such as password
entry. Therefore, it would have been obvious to combine Power with Zavoli for the benefit of 
obtaining a means of improving recognition accuracy through the use of confidence level scores, 
to obtain the invention as specified in Claims 12 and 27. 

With respect to Claim 15, Zavoli further discloses: 

A speech recognition system, wherein the speech recognition unit compares the input
subgroup with stored recognition grammar in order to determine the recognition result (speech 
recognition module featuring a phonetic dictionary, Col 5, Lines 24-28). 

With respect to Claim 16, Zavoli additionally suggests: 

A speech recognition system, wherein the recognition grammar is stored in a remote 
memory accessible by the speech recognition module (invention process implemented on a 
server accessed over telephone lines, Col 3, Lines 15-18).

Thus, it would have been obvious to one of ordinary skill in the art, at the time of 
invention, to implement a speech recognition method, utilizing a phonetic dictionary, at a server 
in order to conserve system memory in a device with limited storage. 

With respect to Claim 17, Zavoli further recites: 

A speech recognition system, wherein the recognition result includes at least one of a 
subgroup of speech units and a negative utterance representation that is included in the 
recognition result, and wherein the rejection criteria is met if the negative utterance is included 
therein (displaying individual digits to a user, upon recognition, for correction/verification, Col 
6, Lines 54-57, and a rejection result representation displayed to a user, along with previously 
recognized digits, in the form of a deleted digit, Col 6, Lines 54-60). 

Claim 19 contains subject matter similar to Claims 6 and 17, and thus, is rejected for the 
same reasons. 

6. Claims 9 and 24 are rejected under 35 U.S.C. 103(a) as being unpatentable over Zavoli
et al, in view of Power et al, and in further view of Larsen ("Investigating a Mixed-Initiative
Dialogue Management Strategy," 1997).




With respect to Claims 9 and 24, Zavoli in view of Power teaches the speech recognition 
system capable of recognizing individual word segments (digits) through pause detection means 
to enable further correction of input and recognition errors, as applied to Claim 1. Neither Zavoli 
nor Power teaches the ability to enter speech units using a dial pad upon repeated recognition 
errors; however, Larsen discloses:

A speech recognition method and system, wherein if said rejection criteria are met 
repeatedly, the user is prompted to use a dial pad to enter the speech units (ability to switch to 
DTMF input mode upon repeated recognition errors, Page 66-67, Application). 

Zavoli, Power, and Larsen are analogous art because they are from a similar field of 
endeavor in speech-controlled interfaces. Thus, it would have been obvious to a person of 
ordinary skill in the art, at the time of invention, to combine the ability to enter speech units in a 
DTMF input mode upon repeated recognition errors as taught by Larsen with the speech 
recognition method and system capable of recognizing individual word segments (digits) through 
pause detection means to enable further correction of input and recognition errors as taught by 
Zavoli in view of Power to offer an alternative means of inputting information in a speech 
interface if a user becomes frustrated with repeated recognition errors. Therefore, it would have 
been obvious to combine Larsen with Zavoli in view of Power for the benefit of offering a user 
an alternative method of data entry in a speech interface upon repeated recognition errors, to 
obtain the invention as specified in Claims 9 and 24. 
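
For illustration only, the switch to dial pad (DTMF) entry upon repeated recognition errors discussed above can be sketched as follows. This sketch is not taken from Larsen, Zavoli, or Power. The function name, the boolean rejection history, and the limit of two consecutive rejections are assumptions made for the example.

```python
from typing import List

MAX_CONSECUTIVE_REJECTIONS = 2  # assumed limit; not stated in the references


def next_input_mode(rejections: List[bool],
                    limit: int = MAX_CONSECUTIVE_REJECTIONS) -> str:
    """Pick the input mode for the next attempt at the current subgroup.

    `rejections` records, oldest first, whether each earlier attempt met
    the rejection criteria. Only the run of rejections at the end of the
    history counts toward the limit.
    """
    consecutive = 0
    for rejected in reversed(rejections):
        if not rejected:
            break
        consecutive += 1
    return "dtmf" if consecutive >= limit else "speech"
```

Counting only the trailing run means a single accepted attempt resets the count, so an occasional error does not force the user out of speech input.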

Conclusion



7. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time
policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event,
however, will the statutory period for reply expire later than SIX MONTHS from the mailing 
date of this final action. 

8. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to James S. Wozniak whose telephone number is (703) 305-8669 
and email is James.Wozniak@uspto.gov. The examiner can normally be reached on Mondays- 
Fridays, 8:30-4:30. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Talivaldis Ivars Smits, can be reached at (703) 306-3011. The fax/phone number for
the Technology Center 2600 where this application is assigned is (703) 872-9306.




Any inquiry of a general nature or relating to the status of this application or proceeding
should be directed to the technology center receptionist whose telephone number is
(703) 306-0377.

James S. Wozniak 



5/18/04 




SUSAN MCFADDEN
PRIMARY EXAMINER 



